Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local monitoring setup with OTEL and Grafana LGTM-stack #97

Merged
merged 18 commits into from
May 22, 2024

Conversation

martinothamar
Copy link
Contributor

@martinothamar martinothamar commented Apr 12, 2024

Description

This PR includes a separate Docker compose file to setup OTEL collector and the Grafana LGTM stack for local monitoring
Includes

  • Backends for logs, traces and metrics
  • Correlation is built into OTEL, navigation between logs and traces in Grafana UI is configured
  • Dashboards published by .NET team for .NET 8/Aspire is automatically provisioned
  • Custom dashboard for Altinn apps

Altinn app dashboard:
image
image

ASP.NET core dashboard:
image

Explore logs:
image

Explore traces:
image

Explore metrics:
image

Decisions:

  • How "built in" should monitoring stack be - included in the primary docker compose file or a separate one? The backends are relatively lightweight, but not free
  • Grafanas own LGTM-stack development image can also be considered but it is far less flexible, and harder to match the fully deployed solution: https://github.com/grafana/docker-otel-lgtm

TODOs

  • Instrument app-lib a lot better
  • Continuous profiling setup? It is being considered for OTEL AFAIK, might be too early
  • Dashboard for runtime metrics (exceptions, threads count, lock contention etc)

Related Issue(s)

Verification

  • Your code builds clean without any errors or warnings
  • Manual testing done (required)
  • Relevant automated test added (if you find this hard, leave it and we'll help out)
  • All tests run green

Documentation

  • User documentation is updated with a separate linked PR in altinn-studio-docs. (if applicable)

@martinothamar martinothamar linked an issue Apr 12, 2024 that may be closed by this pull request
@martinothamar martinothamar added the kind/feature-request New feature or request label Apr 16, 2024
@martinothamar martinothamar self-assigned this Apr 17, 2024
@HauklandJ HauklandJ requested a review from SandGrainOne April 22, 2024 09:54
@ivarne
Copy link
Member

ivarne commented Apr 23, 2024

Not a requirement for this PR, but I'm missing a link in the localtest dotnet app to make this more discoverable, and ideally a message informing that local telemetry is not running if it isn't (just send a http request to verify).

Docs does currently not suggest using the Makefile, so adding everything together in a single makefile would be nice. Potentially behind a --profile flag (as we use in --profile localtest).

For the other services we have separate compose files for docker and podman. Would that be required here also?

@sduranc
Copy link

sduranc commented Apr 23, 2024

This is very nice @martinothamar 💯.

How "built in" should monitoring stack be - included in the primary docker compose file or a separate one? The backends are relatively lightweight, but not free

IMO it should be in a separate one.

Grafanas own LGTM-stack development image can also be considered but it is far less flexible, and harder to match the fully deployed solution: https://github.com/grafana/docker-otel-lgtm

What are we missing exactly from running that image? Is it adding dashboards?

@martinothamar
Copy link
Contributor Author

but I'm missing a link in the localtest dotnet app to make this more discoverable

Good point, now there is a Grafana menu item
image

RE:

Docs does currently not suggest using the Makefile, so adding everything together in a single makefile would be nice. Potentially behind a --profile flag (as we use in --profile localtest).

and

IMO it should be in a separate one.

Opted for using compose profiles in the latest commit

  • --profile "monitoring" brings up existing setup + monitoring profile
  • --profile "*" brings up existing setup + every other profile
  • No profile specified brings up the existing setup

What are we missing exactly from running that image? Is it adding dashboards?

I was guessing that configuration would be more of a hassle but looking at it now maybe it isn't. I won't be able to specify command in the compose file, and for dashboards I will have to get the grafana version number, but the rest seems like it should work fine (the actual config files and such). I will give it a shot

Copy link
Member

@ivarne ivarne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good enough for merging as a preview, but had a few thoughts while reading through.

src/Controllers/InfraController.cs Outdated Show resolved Hide resolved
src/Views/Shared/_Layout.cshtml Show resolved Hide resolved
docker-compose.yml Show resolved Hide resolved
docker-compose.yml Outdated Show resolved Hide resolved
docker-compose.yml Outdated Show resolved Hide resolved
@martinothamar
Copy link
Contributor Author

Running out of time a bit here so I'll keep the LGTM stack as is for now. Other than that

  • Grafana is behind proxy (no exposed port)
  • Only OTel GRPC port exposed on OTel collector
  • Some updates to the dashboard (will update screenshot in a bit)
  • Updated README
  • Implemented podman stuff
  • Implemented a simpler run.cmd cross-platform script and documented it as preview

README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@HauklandJ HauklandJ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM looks good to me

@martinothamar martinothamar force-pushed the feature/monitoring-setup branch from aa90a20 to 5426cbc Compare May 22, 2024 12:02
@martinothamar martinothamar merged commit 5576ea5 into main May 22, 2024
1 check passed
@martinothamar martinothamar deleted the feature/monitoring-setup branch May 22, 2024 12:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature-request New feature or request
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Support for OpenTelemetry in localtest
4 participants